1 Intro

Analysis of market access in Somalia. We will look at food market locations in Somalia and assess their accessibilty by regular sampled points.

2 Setup

This section for reproducibility. We use R in the version 4.2.2. In order to be able to reproduce this analysis you need an openrouteservice API key. Get one here. Put it in a text file file called config.txt in the working directory. It has no structure, just the key in the first line.

ZXCVBNMLKJHGFDSAQWERTYUIOP

2.1 libraries

Almost all of the libraries can be installed via CRAN with the command install.packages('packagename'). The openrouteservice package is not on CRAN and needs to be installed via github. With the command remotes::install_github("GIScience/openrouteservice-r"); Also tmap in its most recent version 4 is not yet available on CRAN so we need to download and install it via github too.

# main libraries
library(tidyverse)
library(glue)
library(sf)
library(units)
library(ggplot2)
library(tictoc)
library(openrouteservice) # remotes::install_github("GIScience/openrouteservice-r");
library(jsonlite)
library(terra)
library(exactextractr)
library(tmap) # remotes::install_github("r-tmap/tmap")
library(leaflet)
library(furrr)
library(purrr)
library(kableExtra)
library(mapsf)
library(RColorBrewer)
library(geonode4R)
library(scales)


#config <- readLines("config.txt")
#api_key <- config[1]
#user <- config[2]
#password <- config[3]
api_key <- ""
# Function to  adjust a list of coordinates to the location of the closest road segment via the snapping endpoint of ORS. ORS itself only allows for a maximum distance of 350m to snap.
ors_snap <- function(x, rowids, local=F) {
  x=coord_list
  rowids=grid5km$rowid
  local = T
  
  library(httr)
  
  # Define the body
  body <- list(
    locations = x,
    radius = 100000000 # apparently the maximum snapping distance is 5km
  )
  
  # Define the headers
  headers <- c(
    'Accept' = 'application/json, application/geo+json, application/gpx+xml, img/png; charset=utf-8',
    'Authorization' = api_key,
    'Content-Type' = 'application/json; charset=utf-8'
  )
  
  endpoint <- if(local){'http://0.0.0.0:8080/ors/v2/snap/driving-car'} else {'https://api.openrouteservice.org/v2/snap/driving-car'}
  # Make the POST request
  response <- POST( endpoint,
                    #'https://api.openrouteservice.org/v2/snap/driving-car', 
                    body = body, 
                    add_headers(headers), 
                    encode = "json")
  
  resp_content <- content(response)
  
  extract_data <- function(x, index) {
    if (is.null(x)) {
      return(data.frame(
        rowid = index,
        lon = NA,
        lat = NA,
        snapped_distance = NA
      ))
    } else {
      return(data.frame(
        rowid = index,
        lon = x$location[[1]],
        lat = x$location[[2]],
        snapped_distance = x$snapped_distance
      ))
    }
  }
  
  # Extract data from the list and create a dataframe
  df <- do.call(rbind, lapply(seq_along(resp_content$locations), function(i) extract_data(resp_content$locations[[i]], i)))
  df$rowid <- rowids
  # Convert the dataframe to an sf object
  sf_df <- df |> 
    drop_na(lon, lat) |>  # Drop rows with NA coordinates
    st_as_sf(coords = c("lon", "lat"), crs = 4326) |>  # Define the coordinate columns and set CRS
    st_sf()
  
  return(sf_df)
  
}

3 Processing pipeline

The workflow is as follows:

  • prepare market location data
  • generate hexagonal 5km grid in Somalia
  • add worldpop data per grid cell
  • snap the grid centroids to the nearest road segment
  • create a matrix for the distance and time for each market to the snapped centroids
  • detect the closest market for each grid cell

By snapping we refer to the process of changing the location of an origin or destination of a route to the nearest road segment. This is necessary because the openrouteservice API only allows for a maximum distance of 350m to snap. However in the context of Somalia greater snapping distances are desired as the road density in some regions might be low.

3.1 Data input

We use the following datasets as inputs:

  • pop: WorldPop population counts constrained
  • market locations data
  • admin boundaries level 2
pop <- rast('data/som_ppp_2020_UNadj_constrained.tif')
markets_som <- read_csv("data/wfp_food_prices_som.csv")
admin <- st_read('data/Som_Admbnda_Adm2_UNDP.shp', quiet = T)

3.2 Markets preprocessing

In order to use the market locations we need to create a sf object by converting the latitude and longitude columns to a point geometry column. By checking the distinct locations of the geometry column we got 44 food markets in Somalia.

# Clean the latitude and longitude columns
markets_som_clean <- markets_som |> 
  filter(!is.na(latitude) & !is.na(longitude)) |> 
  slice(-1) |> 
  mutate(latitude = as.numeric(latitude),
         longitude = as.numeric(longitude),
         date = as.Date(date, format = "%Y-%m-%d"))
         
#add year
markets_som_clean <- markets_som_clean |> 
  mutate(year = as.numeric(format(date, "%Y")))

# Convert to sf object
markets_som_sf <- markets_som_clean |> 
  st_as_sf(coords = c("longitude", "latitude"), crs = 4326) 

#unique markets
markets_som_asc_sf <- markets_som_sf |> 
  distinct(geometry, .keep_all = TRUE) 

#unique markets descending
markets_som_desc_sf <- markets_som_sf |> 
  arrange(desc(row_number())) |>    # Sort the dataframe such that the last rows come first
  distinct(geometry, .keep_all = TRUE)

#combine both
markets_som_sf <- markets_som_desc_sf |> left_join(markets_som_asc_sf |> st_drop_geometry() |> select(market, year), by ="market")

#prepare final market sf
markets_som_sf <- markets_som_sf |> rename(first_year = year.y, last_year = year.x) |> 
  select(-date) |>
  mutate(rowid = row_number())

4 Grid sampling

Finding the shortest/fastest way between a number of origins and destinations is a job for the Matrix endpoint of ors. Lets start with the grid sampling. We decide for a 5000m width hexagonal grid. A hexagonal grid is closer to a circle and therefore the better joice for our analysis. Within each grid cell we will use the centroid and search for road network locations to snap to.

# create 5km grid
grid5km <-
  st_make_grid(admin |> st_transform(20538), cellsize = 5000, square = F, flat_topped = T)
# convert to sf df
grid5km <- grid5km |> st_as_sf()
names(grid5km) <- 'geometry'
st_geometry(grid5km) <- 'geometry'
# get rid of all cells that are not within somalia
grid5km <-
  grid5km |> st_transform(4326) |> st_filter(admin |> st_union())
# join admin codes
grid5km <-
  grid5km |> st_join(admin |> select(admin2Name, admin2Pcod, admin1Name, admin1Pcod, admin0Name, admin0Pcod),
                     left = T,
                     largest = T)
# join pop information
grid5km$pop <- exact_extract(pop, grid5km, 'sum', progress = F)

# add rowid
grid5km <- grid5km |> mutate(rowid = row_number())

#filter for pop > 0
grid5km <- grid5km |> filter(pop > 0)

Time for this code chunk to run: 323.11 seconds

4.1 Snapping

The grid is created, now snap the centroids.

# convert
coord_list <-
  lapply(grid5km$geometry |> st_centroid(), function(x)
    c(st_coordinates(x)[, 1], st_coordinates(x)[, 2]))

# snap the list of cords
snapped_coords <- ors_snap(coord_list,rowids=grid5km$rowid, local = T)

# join back
grid5km_snapped <- grid5km |>
  left_join(snapped_coords |> tibble(), by = "rowid") |>
  mutate(centroid = st_centroid(geometry.x))

# refactor names in order to keep, the grid geometry, centorid and snapped centroid
names(grid5km_snapped)[10:11] <- c('geometry', 'snapped_centroid')
st_geometry(grid5km_snapped) <- 'geometry'

grid5km_snapped <- grid5km_snapped |> mutate(snapped = ifelse(is.na(snapped_distance), F, T)) 

#filter for snapped values
grid5km_snapped_NA <- grid5km_snapped |> filter(snapped == F)
grid5km_snapped_notNA <- grid5km_snapped |> filter(snapped == T)

Time for this code chunk to run: 22.18 seconds

Awesome we now got 17738 snapped and 3045 not snapped grid cells.

4.2 ORS matrix request and post processing

In the next chunk we will calculate the travel times and distances from each grid cell to each market. We will use the openrouteservice API for this.

## run 1 / 44: 5.313 sec elapsed
## run 2 / 44: 5.063 sec elapsed
## run 3 / 44: 4.532 sec elapsed
## run 4 / 44: 5.419 sec elapsed
## run 5 / 44: 5.091 sec elapsed
## run 6 / 44: 4.777 sec elapsed
## run 7 / 44: 4.941 sec elapsed
## run 8 / 44: 4.634 sec elapsed
## run 9 / 44: 4.092 sec elapsed
## run 10 / 44: 5.024 sec elapsed
## run 11 / 44: 6.178 sec elapsed
## run 12 / 44: 4.925 sec elapsed
## run 13 / 44: 4.724 sec elapsed
## run 14 / 44: 5.053 sec elapsed
## run 15 / 44: 4.62 sec elapsed
## run 16 / 44: 4.523 sec elapsed
## run 17 / 44: 5.523 sec elapsed
## run 18 / 44: 5.734 sec elapsed
## run 19 / 44: 4.811 sec elapsed
## run 20 / 44: 4.703 sec elapsed
## run 21 / 44: 4.855 sec elapsed
## run 22 / 44: 5.113 sec elapsed
## run 23 / 44: 4.572 sec elapsed
## run 24 / 44: 5.971 sec elapsed
## run 25 / 44: 7.065 sec elapsed
## run 26 / 44: 4.897 sec elapsed
## run 27 / 44: 5.225 sec elapsed
## run 28 / 44: 4.904 sec elapsed
## run 29 / 44: 4.302 sec elapsed
## run 30 / 44: 5.031 sec elapsed
## run 31 / 44: 6.649 sec elapsed
## run 32 / 44: 5.31 sec elapsed
## run 33 / 44: 4.482 sec elapsed
## run 34 / 44: 5.13 sec elapsed
## run 35 / 44: 4.295 sec elapsed
## run 36 / 44: 4.818 sec elapsed
## run 37 / 44: 4.889 sec elapsed
## run 38 / 44: 4.497 sec elapsed
## run 39 / 44: 4.831 sec elapsed
## run 40 / 44: 4.984 sec elapsed
## run 41 / 44: 4.568 sec elapsed
## run 42 / 44: 4.786 sec elapsed
## run 43 / 44: 4.73 sec elapsed
## run 44 / 44: 4.452 sec elapsed

Awesome we now have the distances from all 44 markets to all grids. The resulting dataframe has 780472 rows. Now we want to find the market that is closest from every grid cell in terms of traveltime.

5 Visualization and Assessment

In the following chunks some visualizations are executed to assess the results.

5.1 Non Spatial

What is the relation between Traveltime and population for each grid cell?

5.2 Spatial

Maps of population distribution, travel time to the closest market and the closest market with respective administrative boundaries and the markets locations.

5.2.1 Population distribution

## tmap mode set to 'view'

5.2.2 Traveltime to closest market

5.2.3 Closest market

6 Output

Grid export in geopackage format.